Max-Margin Synchronous Grammar Induction for Machine Translation
Abstract
Traditional synchronous grammar induction estimates parameters by maximizing likelihood, which is only loosely related to translation quality. As an alternative, we propose a max-margin estimation approach to discriminatively inducing synchronous grammars for machine translation, which directly optimizes translation quality as measured by BLEU. In the max-margin estimation of parameters, we only need to calculate Viterbi translations. This in turn makes it easier to incorporate various non-local features that are defined on the target side. We test the effectiveness of our max-margin estimation framework on a competitive hierarchical phrase-based system. Experiments show that our max-margin method significantly outperforms the traditional two-step pipeline for synchronous rule extraction by 1.3 BLEU points and is also better than a previous max-likelihood estimation method.
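To make the idea of max-margin training against a BLEU-based cost more concrete, the following is a minimal sketch of a generic structured hinge-loss update over a small candidate list. It is not the authors' formulation: the function names, the candidate representation, the learning rate, and the assumption that each candidate carries a precomputed cost (for example 1 minus sentence-level BLEU) are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation: (feature vector, cost), where cost is assumed
# to be something like 1 - sentence-level BLEU, computed elsewhere.
Candidate = Tuple[Sequence[float], float]


def dot(weights: Sequence[float], feats: Sequence[float]) -> float:
    """Linear model score of one candidate translation."""
    return sum(w * f for w, f in zip(weights, feats))


def hinge_update(
    weights: Sequence[float],
    candidates: List[Candidate],
    oracle_index: int,
    lr: float = 0.1,
) -> Tuple[List[float], float]:
    """One subgradient step on a structured hinge loss.

    The single best cost-augmented candidate plays the role of the Viterbi
    translation; no sum over all derivations is needed.
    """
    oracle_feats, oracle_cost = candidates[oracle_index]

    # Cost-augmented best candidate: high model score and high cost.
    feats, cost = max(candidates, key=lambda c: dot(weights, c[0]) + c[1])

    # Margin violation relative to the oracle (lowest-cost) candidate.
    loss = (dot(weights, feats) + cost) - (dot(weights, oracle_feats) + oracle_cost)
    loss = max(0.0, loss)

    if loss > 0.0:
        # Move weights toward the oracle features and away from the violator.
        new_weights = [w + lr * (o - f) for w, o, f in zip(weights, oracle_feats, feats)]
    else:
        new_weights = list(weights)
    return new_weights, loss


# Toy usage: candidate 0 is the oracle (cost 0.0), candidate 1 has cost 0.6.
toy = [([1.0, 0.0], 0.0), ([0.0, 1.0], 0.6)]
updated, violation = hinge_update([0.0, 0.0], toy, oracle_index=0)
```

In this toy run the weights start at zero, so the higher-cost candidate violates the margin and the update shifts weight toward the oracle's features. Because only one best candidate is needed per step, features that depend on the full target string can be scored on that candidate directly.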
Similar resources
Synchronous Constituent Context Model for Inducing Bilingual Synchronous Structures
Traditional Statistical Machine Translation (SMT) systems heuristically extract synchronous structures from word alignments, while synchronous grammar induction provides better solutions that can dispense with heuristic methods and directly obtain statistically sound bilingual synchronous structures. This paper proposes the Synchronous Constituent Context Model (SCCM) for synchronous grammar induction. Th...
Bayesian Synchronous Grammar Induction
We present a novel method for inducing synchronous context free grammars (SCFGs) from a corpus of parallel string pairs. SCFGs can model equivalence between strings in terms of substitutions, insertions and deletions, and the reordering of sub-strings. We develop a non-parametric Bayesian model and apply it to a machine translation task, using priors to replace the various heuristics commonly u...
Unsupervised Discriminative Induction of Synchronous Grammar for Machine Translation
We present a global log-linear model for synchronous grammar induction, which is capable of incorporating arbitrary features. The parameters in the model are trained in an unsupervised fashion from parallel sentences without word alignments. To make parameter training tractable, we also propose a novel and efficient cube pruning based synchronous parsing algorithm. Using learned synchronous gra...
A Bayesian Model for Learning SCFGs with Discontiguous Rules
We describe a nonparametric model and corresponding inference algorithm for learning Synchronous Context Free Grammar derivations for parallel text. The model employs a Pitman-Yor Process prior which uses a novel base distribution over synchronous grammar rules. Through both synthetic grammar induction and statistical machine translation experiments, we show that our model learns complex transl...
A Statistical Machine Translation Model Based on a Synthetic Synchronous Grammar
Recently, various synchronous grammars have been proposed for syntax-based machine translation, e.g. synchronous context-free grammar and synchronous tree (sequence) substitution grammar, either purely formal or linguistically motivated. Aiming at combining the strengths of different grammars, we describe a synthetic synchronous grammar (SSG), which tentatively in this paper, integrates a synchronou...